08. Activity Classification
Activity Classification Summary
Now that we've explored the data, examined the literature, chosen our features, and pre-processed all the data, it's time to finally build the classifier!
First, we run feature extraction on 10-second, non-overlapping windows to produce the training data. We then use sklearn to build a random forest classifier, setting the hyperparameters to 100 trees, each with a maximum depth of 4.
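The setup described above can be sketched roughly as follows. This is a minimal illustration, not the notebook's code: the sampling rate `FS`, the helper `window_features`, the three placeholder features, and the synthetic signals are all assumptions made for the example. Only the 10-second non-overlapping windows and the hyperparameters (100 trees, maximum depth 4) come from the lesson.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier

FS = 256        # assumed sampling rate in Hz (hypothetical, not from the lesson)
WINDOW_S = 10   # window length in seconds, as stated in the lesson


def window_features(signal, fs=FS, window_s=WINDOW_S):
    """Split a 1-D signal into non-overlapping windows and compute toy features."""
    n = fs * window_s                      # samples per window
    n_windows = len(signal) // n           # drop any incomplete trailing window
    windows = signal[: n_windows * n].reshape(n_windows, n)
    # Placeholder features (assumed): mean, standard deviation, peak-to-peak range.
    return np.column_stack(
        [windows.mean(axis=1), windows.std(axis=1), np.ptp(windows, axis=1)]
    )


# Synthetic stand-in data: three "activities" that differ only in amplitude.
rng = np.random.default_rng(0)
signals = {
    0: rng.normal(0, 0.5, FS * 60),
    1: rng.normal(0, 1.0, FS * 60),
    2: rng.normal(0, 2.0, FS * 60),
}

X_parts, y_parts = [], []
for label, sig in signals.items():
    feats = window_features(sig)
    X_parts.append(feats)
    y_parts.append(np.full(len(feats), label))
X = np.vstack(X_parts)          # one row per 10-second window
y = np.concatenate(y_parts)     # one class label per window

# Hyperparameters from the lesson: 100 trees, each with a maximum depth of 4.
clf = RandomForestClassifier(n_estimators=100, max_depth=4, random_state=42)
clf.fit(X, y)
```

Keeping the trees shallow limits how closely each one can fit the training windows, which matters when the number of windows is small.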
Now we are ready to build and train the model!
Because we just trained on the whole dataset, we can't easily evaluate the model: there is no held-out data left to test it on. Still, one way to evaluate the performance of a multi-class classifier is to look at a confusion matrix, which shows how many data points were misclassified and what they were misclassified as.
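As a small illustration of how to read a confusion matrix, the sketch below builds one with sklearn's `confusion_matrix` from made-up labels. The class names and predictions are hypothetical and are not taken from the lesson's dataset.

```python
import numpy as np
from sklearn.metrics import confusion_matrix

# Hypothetical true and predicted labels for eight windows.
labels = ["bike", "run", "walk"]
y_true = np.array(["bike", "bike", "run", "run", "run", "walk", "walk", "walk"])
y_pred = np.array(["bike", "walk", "run", "run", "walk", "walk", "walk", "bike"])

# Rows are true classes, columns are predicted classes, in the order of `labels`.
cm = confusion_matrix(y_true, y_pred, labels=labels)
print(cm)

# Diagonal entries are correct classifications; everything else is an error.
accuracy = np.trace(cm) / cm.sum()
print(f"accuracy = {accuracy:.3f}")
```

Each off-diagonal entry says which class was mistaken for which. Here, for example, row "run" and column "walk" holds 1, meaning one "run" window was classified as "walk".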
Leave-One-Subject-Out Cross Validation
Summary
In this lesson, we finally use our features to train a random forest model. We talk about model performance and use cross-validation to estimate our accuracy. We end up with a model with an overall classification accuracy of 73%, which is the percentage of correct classifications made by the model. But don’t fret, we’ll do better in the next video!
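One common way to cross-validate data collected from several subjects is to hold out one subject at a time, train on the rest, and sum the resulting confusion matrices. The sketch below does this with sklearn's `LeaveOneGroupOut`. The synthetic features, the number of subjects, and the variable names are assumptions for illustration, so the accuracy it prints has no relation to the 73% reported above.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import confusion_matrix
from sklearn.model_selection import LeaveOneGroupOut

# Synthetic stand-in data (assumed): 4 subjects, 30 windows each, 3 classes.
rng = np.random.default_rng(0)
n_subjects, windows_per_subject, n_classes = 4, 30, 3
y = np.tile(np.repeat(np.arange(n_classes), windows_per_subject // n_classes), n_subjects)
subjects = np.repeat(np.arange(n_subjects), windows_per_subject)
X = rng.normal(size=(len(y), 5)) + y[:, None]  # class-dependent shift gives some signal

logo = LeaveOneGroupOut()
cm = np.zeros((n_classes, n_classes), dtype=int)

# Each split holds out every window from exactly one subject.
for train_idx, test_idx in logo.split(X, y, groups=subjects):
    clf = RandomForestClassifier(n_estimators=100, max_depth=4, random_state=42)
    clf.fit(X[train_idx], y[train_idx])
    y_pred = clf.predict(X[test_idx])
    # Accumulate this fold's confusion matrix into the running total.
    cm += confusion_matrix(y[test_idx], y_pred, labels=np.arange(n_classes))

accuracy = np.trace(cm) / cm.sum()
print(cm)
print(f"cross-validated accuracy = {accuracy:.2f}")
```

Splitting by subject rather than by window keeps windows from the same person out of both the training and test sets, so the estimate reflects performance on people the model has not seen.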
Quiz
SOLUTION:
Calling bananas oranges
Code from Video
Notebook Review
If you want to interact with the notebook in the video, you can access it in the repo at /activity-classifier/walkthroughs/activity-classifier/ or in the workspace below.
The dataset that will be used throughout this lesson can be found at the top of the lesson directory at /activity-classifier/data/.
Code
If you need the code, it is available at https://github.com/udacity.
Further Resources
Random forests are bagged ensembles of decision trees: each tree is trained on a bootstrap sample of the data, and the forest combines the trees' predictions. You need to understand a decision tree before learning what a random forest model is. Start with the sklearn tutorial on decision trees. Then check out videos on YouTube for a visual explanation.
See this list of classification accuracy metrics that can be computed in sklearn.
Follow this series of blog posts for an understanding of how these accuracy metrics work on multiclass problems like ours.
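To give a feel for how these metrics behave on a multiclass problem, here is a small sketch that computes overall accuracy alongside per-class precision and recall with sklearn. The labels and predictions are hypothetical toy values, not results from the lesson.

```python
import numpy as np
from sklearn.metrics import accuracy_score, precision_recall_fscore_support

# Hypothetical true and predicted labels for eight windows.
labels = ["bike", "run", "walk"]
y_true = ["bike", "bike", "run", "run", "run", "walk", "walk", "walk"]
y_pred = ["bike", "walk", "run", "run", "walk", "walk", "walk", "bike"]

# Overall accuracy: fraction of windows classified correctly.
accuracy = accuracy_score(y_true, y_pred)

# Per-class metrics, returned as arrays in the order given by `labels`.
precision, recall, f1, support = precision_recall_fscore_support(
    y_true, y_pred, labels=labels, zero_division=0
)

# Macro-averaging weights every class equally, regardless of its size.
macro_precision = precision.mean()
macro_recall = recall.mean()

for name, p, r, n in zip(labels, precision, recall, support):
    print(f"{name}: precision={p:.2f} recall={r:.2f} support={n}")
print(f"accuracy={accuracy:.3f} macro precision={macro_precision:.3f}")
```

A single accuracy number can hide the fact that one class is recognized much better than another, which is why the per-class values are worth inspecting.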
Glossary
- Cross-validation: A technique for estimating model performance in which multiple models are trained and tested, each on a different partition of the entire dataset.
- Classification accuracy: The percent of correct classifications made by a model.